Welcome back! In the last post we derived eigenvectors. In particular, we saw how useful they are in analyzing matrices we need to apply again and again.
In this post, we're going to dive into one of the most famous applications of eigenvectors - the original PageRank algorithm that allowed Google to create the world's best search engine. PageRank grew out of Larry Page's PhD research at Stanford. He spun out this amazing algorithm to redefine search and create one of the most iconic companies in the modern era. We're going to re-derive it here and see that it's just a simple application of eigenvectors!
By the end of this post, you will:
Feel like you could have come up with PageRank yourself.
Understand how eigenvectors are central to PageRank.
See how central eigenvectors are to dynamic processes in general.
Making Friends
Ok, let's build out our understanding of PageRank using a hypothetical scenario.
Let's say you just moved to a new city and you’re trying to make some new friends. You need to figure out which of your new friends to hang out with in your free time. But unfortunately you’re super busy and have very little free time!
So you need to rank all these potential friends to prioritize which ones to chill with.
This could be you! Just need to find the friends...
In a moment of clarity you realize the following:
Friends you’re close with are probably going to hang out with people you’d like.
Someone everyone is friends with is probably someone you are going to be friends with.
You decide to use these two insights to create a scheme you call FriendRank.
FriendRank gives a potential friend a score (FriendScore) based on:
How many people are friends with this person weighted by,
How close you are to the people who are friends with them.
In the above picture, how close you are to a person is shown by the size of the circle. To find your FriendScore for potential friend $F$, you just add up how close you are to $A$, $C$, and $D$.
Let’s write this down as a formula. Let’s say the strength of the friendship between you and some person $p$ is $FS(p)$. Let $Friends(p)$ be the set of friends of $p$.
The FS (FriendScore) for a potential friend Fred is:
$$FS(Fred) = \sum_{p \in Friends(Fred)} FS(p)$$
Calculating Our First Friend Score
Let’s work through an example. Say you have 4 potential friends:
Jimmy (closeness score - 5)
Maya (closeness score - 9)
Lisa (closeness score - 8)
Fred (?)
In this world, you’re really close to Maya (with a score of 9) and Lisa (8) but only kinda close to Jimmy (5).
Only Jimmy and Maya are friends with Fred, so Fred's score is $FS(Fred) = FS(Jimmy) + FS(Maya) = 5 + 9 = 14$.
Graphically, this is:
Only Jimmy and Maya claim to be friends with Fred. Lisa seems to be tearing him apart.
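If it helps to see this as code, here's a minimal Python sketch of the formula so far, using the closeness scores and friendships from this example (the names and numbers are just the ones above):

```python
# Closeness scores for the people you already know.
closeness = {"Jimmy": 5, "Maya": 9, "Lisa": 8}

# Only Jimmy and Maya claim to be friends with Fred.
friends_of_fred = ["Jimmy", "Maya"]

# FS(Fred): add up your closeness to everyone who is friends with Fred.
fs_fred = sum(closeness[p] for p in friends_of_fred)
print(fs_fred)  # 14
```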
Accounting For People who are Friends With Everyone
Not such a crazy idea right? You start this process and you realize one thing. Your really close friend Maya claims to be friends with EVERYONE. You start wondering, "Hmm, maybe I need to downweight Maya's opinion because she doesn't really seem to be that picky."
This is easy - we just divide Maya's weighting score by how many people she claims to be friends with. That way if Maya claims 100 friends, her weighting for each friend is divided by 100. So our formula is now:
$$FS(Fred) = \sum_{p \in Friends(Fred)} FS(p) \cdot \frac{1}{NumFriends(p)}$$
Let
$$IsFriends(p, Fred) = \begin{cases} 1 & \text{if } p \text{ calls Fred a friend} \\ 0 & \text{otherwise} \end{cases}$$
Then we can simplify our earlier formula as shown below:
$$FS(Fred) = \sum_{p} FS(p) \cdot \frac{IsFriends(p, Fred)}{NumFriends(p)}$$
where the sum now runs over all people $p$.
All the people who aren’t friends with Fred just get $0$ terms which gives us the same thing.
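In code, the downweighting is just a division by the length of each person's claimed friend list. Here's a small Python sketch (the friend lists below are hypothetical, purely to illustrate the formula):

```python
# Hypothetical closeness scores and claimed friend lists.
closeness = {"Jimmy": 5, "Maya": 9, "Lisa": 8}
friends = {
    "Jimmy": ["Fred", "Maya"],                   # Jimmy claims 2 friends
    "Maya":  ["Fred", "Jimmy", "Lisa", "Bob"],   # Maya claims everyone
    "Lisa":  ["Jimmy"],
}

def friend_score(target):
    # FS(target) = sum over p of FS(p) * IsFriends(p, target) / NumFriends(p)
    return sum(
        closeness[p] * (target in claimed) / len(claimed)
        for p, claimed in friends.items()
    )

print(friend_score("Fred"))  # 5 * 1/2 + 9 * 1/4 = 4.75
```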
So Who's Friends With Whom?
Ok this is looking great! You decide to go ahead and implement this scheme with your friends Bobby, Lauren, Daniel, Rony, and Saba. You start by asking all your friends to write down who they’re friends with! So you send them all the following email:
From: parry.lage@boogle.com
To: Bobby, Lauren, Daniel, Rony, Saba
Subject: Who are your friends?
My dearest "friends",
I’m trying to rank you and all my other potential friends and need your help to do so. Below is a list of people. Please put a 1 next to their name if you call them your friend.
Person   Friends?
Bobby    ___
Lauren   ___
Daniel   ___
Rony     ___
Saba     ___
Yours truly,
Parry Lage
P.S. Just because I'm sending you this email doesn't mean we're friends.
So Saba sends you something like this:
Person   IsFriends
Bobby    1
Lauren   1
Daniel   0
Rony     0
Saba     0
Looking good so far! You then remember you need to downweight Saba's score by how many friends she claims to have. She's claiming 2 friends here, so we divide by 2, giving us:
Person   IsFriends
Bobby    1/2
Lauren   1/2
Daniel   0
Rony     0
Saba     0
Sweet.
You now decide to organize all your friends' responses into a little table so that it's easy to see. Each column of the table is a different friend. It looks like this (notice Saba's answers above are the last column in the table below):
         Bobby   Lauren   Daniel   Rony   Saba
Bobby     0       1        1/3      0     1/2
Lauren   1/4      0        1/3      0     1/2
Daniel   1/4      0         0       0      0
Rony     1/4      0         0       0      0
Saba     1/4      0        1/3      1      0
This table is just a matrix, which we'll call $A$. The element $i,j$ in the matrix is $A[i,j] = \frac{IsFriends(j, i)}{NumFriends(j)}$. For example, take the entry in row 1 (Bobby), column 5 (Saba):
$IsFriends(5,1)$ is whether Saba claims to be friends with Bobby. We saw in her response she did, so $IsFriends(5,1) = 1$.
$NumFriends(5)$ is the number of friends Saba has (2). Hence $NumFriends(5) = 2$.
So $A[1,5] = \frac{1}{2}$, just as it is in the table.
Notice additionally that $IsFriends$ is not symmetric! For instance, Rony may claim to be friends with Saba, but in the above example, Saba doesn't call Rony a friend.
Hence, $IsFriends(Rony, Saba) = 1$, but $IsFriends(Saba, Rony) = 0$.
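To make this concrete, here's a short Python/NumPy sketch that builds $A$ from claimed friend lists (the lists below are ones consistent with the table; note the code uses 0-indexing while the text uses 1-indexing):

```python
import numpy as np

people = ["Bobby", "Lauren", "Daniel", "Rony", "Saba"]

# Claimed friend lists consistent with the table above.
friends = {
    "Bobby":  ["Lauren", "Daniel", "Rony", "Saba"],
    "Lauren": ["Bobby"],
    "Daniel": ["Bobby", "Lauren", "Saba"],
    "Rony":   ["Saba"],
    "Saba":   ["Bobby", "Lauren"],
}

n = len(people)
A = np.zeros((n, n))
for j, p in enumerate(people):         # column j holds person p's answers
    for f in friends[p]:
        i = people.index(f)            # row i is the person p claims
        A[i, j] = 1 / len(friends[p])  # IsFriends(j, i) / NumFriends(j)

print(A[0, 4])  # Bobby's row, Saba's column: 0.5, just like the table
```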
Adding Your Own Scores
You then decide you want to come up with an initial guess for everyone's FriendScore! There are some people you know better than others, so this is really a guess for what your final FriendScore is for each of them. You write down the following and call it $FS_0$ - your initial FriendScore:
Person   FriendScore
Bobby    1
Lauren   1
Daniel   1
Rony     1
Saba     1
Combining Everyone's Opinion
Omg this is a lot more work than I expected.
Ok we now have the following:
$A$ - a matrix whose elements are $A[i,j] = \frac{IsFriends(j, i)}{NumFriends(j)}$.
$FS_0$ - an initial guess of everyone's FriendScore.
Now that we know who's friends with whom, we can figure out who's really likely to be our friend.
We'd like to come up with a new FriendScore $FS_1$ based on this information.
How do we do that?
Recall from earlier our formula for Fred's FriendScore:
$$FS(Fred) = \sum_{p} FS(p) \cdot \frac{IsFriends(p, Fred)}{NumFriends(p)}$$
Remember, the elements of $A$ are just $A[i,j] = \frac{IsFriends(j, i)}{NumFriends(j)}$. So the first row of $A$ holds exactly the weights $\frac{IsFriends(p, Bobby)}{NumFriends(p)}$ for each person $p$ - multiplying that row by the vector of FriendScores is exactly our formula for Bobby's FriendScore!
So, if we call $FS_1$ the result of folding the results of $A$ into your initial predictions, we get:
$$A \cdot FS_0 = FS_1$$
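Continuing the NumPy sketch from earlier (reusing `A`, `n`, and `people`), this update is a single matrix-vector product:

```python
FS_0 = np.ones(n)  # our initial guess: everyone starts at 1
FS_1 = A @ FS_0    # fold everyone's opinions into the guess
print(dict(zip(people, FS_1)))
```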
When Do We Stop?
Ok so we now have a super quick way of finding our next FriendScore. Just take our current guess, which we'll call $FS_i$, multiply it by $A$, and obtain $FS_{i+1}$:
$$FS_{i+1} = A \cdot FS_i$$
Every time we compute $A \cdot FS_i$, we are updating our guess of the FriendScores with the rest of our friends' opinions.
When do we stop this process? We stop when $FS_{i+1}$ stops changing:
$$FS_{i+1} = FS_i = FS^*$$
At this point, which we call $FS^*$, we have:
$$A \cdot FS^* = FS^*$$
Does this look like the following?
$$Ax = \lambda x$$
Yes!
This is exactly the formula for an eigenvector!
So the final solution $FS^*$ is an eigenvector of $A$ - one with eigenvalue $\lambda = 1$!
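Here's the whole process as a short sketch, continuing the NumPy example from before (this repeated multiplication is usually called power iteration). Since $A$'s columns each sum to 1, the total score is conserved as we iterate:

```python
FS = np.ones(n)
for _ in range(1000):              # plenty of steps to converge
    FS_next = A @ FS
    if np.allclose(FS_next, FS):   # stop when the scores stop changing
        break
    FS = FS_next

print(dict(zip(people, FS)))       # the final FriendScores, FS*
print(np.allclose(A @ FS, FS))     # True: FS* is an eigenvector, eigenvalue 1
```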
Seeing This Visually
How does $FS_i$ move to $FS^*$?
In many ways, the matrix $A$ pulls $FS_i$ towards its eigenvector $FS^*$.
There’s an awesome visualization of this here by setosa.io that I highly recommend you check out. Here’s an extract directly from their writeup. In the below image, $s_1$ is an eigenvector of $A$. Notice how as we keep applying $A$ on $v$, we move closer to $s_1$. In particular, the sequence $v$, $Av$, $A^2v$, ... eventually ends up directly on $s_1$.
As you keep applying $A$ on $v$, $v$ gets pulled towards the eigenvector $s_1$.
Similarly, in FriendRank, as we keep multiplying $FS_0$ by $A$, we get the sequence $A \cdot FS_0$, $A^2 \cdot FS_0$, ... which eventually ends up on $FS^*$ - the eigenvector of $A$.
From Friends to Pages
You’ve probably guessed at this point that you can switch the word friends to pages in the above and get PageRank! In the world of webpages:
One webpage claims to be “friends” with another if it links out to the webpage.
Google finds out who's friends with whom (our initial survey) by just crawling the web and finding out who links to whom.
The number of "friends" a webpage claims to have is just the number of links present on the webpage.
We create the matrix $A$ by crawling the web. Once we have it, we find its eigenvector to get an estimate of the PageRanks.
To get PageRank, just replace people with pages.
Putting this together gives us a way of ranking all webpages on the internet for relevance!
I find it really cool how PageRank models the abstract concept of webpages using simple insights into how we interact as humans.
A few quick asides:
The formula we've derived for PageRank is a simplified version of the actual one - the main addition is a damping factor, sketched after this list. To see the full formula, read the original paper here.
Notice that PageRank is actually a next level pun. PageRank was written by Larry Page about ranking webpages - well done Larry.
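For the curious: the damping factor $d$ (typically around 0.85) models a surfer who occasionally jumps to a completely random page instead of following a link. Roughly, in our notation, with $N$ total pages:

$$PR(page) = \frac{1-d}{N} + d \sum_{p} PR(p) \cdot \frac{IsFriends(p, page)}{NumFriends(p)}$$

This small tweak keeps the process from getting stuck on pages with no outgoing links and guarantees the iteration converges to a unique answer.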
Conclusion
So we’ve seen that eigenvectors are at the heart of one of the most important algorithms of all time! More generally, whenever you see a linear function being applied again and again and again (like $A$), you are without a doubt going to be looking for the eigenvectors of that linear function/matrix. The eigenvectors will tell you where that process ends!
I hope you enjoyed this unique and powerful application of eigenvectors. In the next post, we're going to step back into Matrix Algebra. We'll learn about Kernels (or Nullspaces) and visually see how they help us immediately grasp what matrices do to their inputs.